Abstract — Two new rate-one full-diversity space-time block
codes (STBC) are proposed. They are characterized by the lowest 
decoding complexity among the known rate-one STBC, arising 
due to the complete separability of the transmitted symbols into 
four groups for maximum likelihood detection. The first and 
the second codes are delay-optimal if the number of transmit 
antennas is a power of 2 and even, respectively. The exact pair- 
wise error probability is derived to allow for the performance 
optimization of the two codes. Compared with existing low- 
decoding complexity STBC, the two new codes offer several 
advantages such as higher code rate, lower encoding/decoding 
delay and complexity, lower peak-to-average power ratio, and 
better performance. 

Index Terms — Orthogonal designs, performance analysis, 
quasi-orthogonal space-time block codes, space-time block codes. 



I. Introduction 

Space-time block codes (STBC) have been extensively
studied since they exploit the diversity and/or the capacity
of multiple-input multiple-output (MIMO) channels. Among
various STBC, orthogonal STBC (OSTBC) [1]-[3] offer the
minimum decoding complexity and full diversity. However,
they have low code rates when the number of transmit (Tx)
antennas is more than 2 [3]. The rate of one symbol per
channel use (pcu) only exists for 2 Tx antennas and the rate
approaches 1/2 for a large number of Tx antennas [1]-[3].

To improve the low rate of OSTBC, several quasi- 
orthogonal STBC (QSTBC) have been proposed (see [4]-[7] 
and references therein). They allow joint maximum likelihood 
(ML) decoding of pairs of complex symbols. However, the 
rate-one QSTBC exist for 4 Tx antennas only and the code 
rate is smaller than 1 for more than 4 Tx antennas. Several 
rate-one STBC have been proposed (e.g. [8]-[10]), in which 
the transmitted symbols can be completely separated into 
two groups for ML detection. However, for more than 4 Tx
antennas, the decoding complexity of the rate-one STBC in 
[8]-[10] increases significantly compared with OSTBC and 
QSTBC. 

In this paper, we propose two new rate-one STBC for any 
number of Tx antennas. Compared with the existing rate-one 
STBC, our new codes have the lowest decoding complexity since
the transmitted symbols can be decoupled into 4 groups (4Gp) 
for ML detection. The first code is called 4Gp-QSTBC. The 
second code is derived from semi-orthogonal algebraic space- 
time (SAST) codes [10] and thus called 4Gp-SAST codes. The 
first and the second codes are delay-optimal when the number 
of Tx antennas is a power of 2 and even, respectively. The 
equivalent transmit-receive signals are derived so that sphere 
decoders [11] can be applied for data detection. To achieve 
full-diversity, signal rotations are required for the two codes. 
The exact pair-wise error probability (PEP) of the two codes 
is derived to optimize the signal rotations. 

We compare the main parameters of our new codes and
several existing STBC for 6 and 8 Tx antennas in Table I.
Clearly, the new codes offer several distinct advantages
such as higher code rate, lower decoding complexity, and lower
encoding/decoding delay. The two new codes also have lower
peak-to-average power ratio (PAPR) than OSTBC, QSTBC, 
and minimum decoding complexity (MDC) QSTBC [12]. 
Moreover, simulation results show that our new codes also 
yield significant SNR gains compared with the existing codes. 

Notation: Superscripts $^T$, $^*$, and $^\dagger$ denote matrix transpose,
conjugate, and transpose conjugate, respectively. The identity
and all-zero square matrices of proper size are denoted by
$I$ and $0$. The diagonal matrix with elements of vector $x$ on
the main diagonal is denoted by diag($x$). $\|X\|_F$ stands for the
Frobenius norm of matrix $X$ and $\otimes$ denotes Kronecker product
[13]. A mean-$m$ and variance-$\sigma^2$ circularly complex Gaussian
random variable is written by $\mathcal{CN}(m, \sigma^2)$. $\Re(X)$ and $\Im(X)$
denote the real and imaginary parts of $X$, respectively.

TABLE I

Comparison of Several Low Complexity STBC for 6 and 8
Antennas. The Numbers in the Parentheses Indicate the Codes'
Parameters for 8 Tx Antennas.

Codes               Maximal rate   Delay     Real symbol decoding
OSTBC [3], [24]     2/3 (5/8)      30 (56)   1 or 2 (1 or 2)
CIOD [17]           6/7 (4/5)      14 (50)   2 (2)
MDC-QSTBC [12]      3/4 (3/4)      8 (8)     2 (2)
QSTBC [6]           3/4 (3/4)      8 (8)     4 (4)
2Gp-QSTBC [8]       1 (1)          8 (8)     8 (8)
SAST [10]           1 (1)          6 (8)     6 (8)
4Gp-QSTBC (new)     1 (1)          8 (8)     4 (4)
4Gp-SAST (new)      1 (1)          6 (8)     3 (4)

II. System Model and Preliminaries 

A. System Model 

We consider data transmission over a MIMO quasi-static 
Rayleigh flat fading channel with M Tx and N receive (Rx) 
antennas [14]. The channel gain $h_{mn}$ ($m = 1, 2, \ldots, M$; $n =
1, 2, \ldots, N$) between the $(m, n)$-th Tx-Rx antenna pair is
assumed $\mathcal{CN}(0, 1)$ and remains constant over $T$ time slots.
We assume no spatial correlation at either Tx or Rx array. 
The receiver, but not the transmitter, completely knows the 
channel gains. 

A $T \times M$ STBC can be represented in a general dispersion
form [14] as follows:

$$X = \sum_{k=1}^{K} \left( a_k A_k + b_k B_k \right) \tag{1}$$

where $A_k$ and $B_k$ ($k = 1, 2, \ldots, K$) are $T \times M$ constant
matrices, commonly called dispersion matrices; $a_k$ and $b_k$ are
the real and imaginary parts of the symbol $s_k$. We can use an
equivalent form of STBC as

$$X = \sum_{l=1}^{L} c_l C_l \tag{2}$$

where $L$ is the number (not necessarily even) of transmitted
symbols, $c_l$ are real-valued transmitted symbols, and $C_l$ are
dispersion matrices. The average energy of code matrices is
constrained such that $\mathcal{E}_X = \mathbb{E}[\|X\|_F^2] = T$.

The received signals $y_{tn}$ of the $n$th antenna at time $t$ can be
arranged in a matrix $Y$ of size $T \times N$. Thus, one can represent
the Tx-Rx signal relation as [14], [15]

$$Y = \sqrt{\rho}\, X H + Z \tag{3}$$

where $H = [h_{mn}]$ is the channel matrix and $Z = [z_{tn}]$ is the
noise matrix of size $T \times N$, whose elements $z_{tn}$ are independent and
identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$. The Tx power is scaled
by $\rho$ so that the average signal-to-noise ratio (SNR) at each
Rx antenna is $\rho$, independent of the number of Tx antennas.

Let the data vector be $c = [c_1\ c_2\ \ldots\ c_L]^T$. The ML
decoding of STBC is to find the solution $\hat{c}$ so that:

$$\hat{c} = \arg\min_{c} \left\| Y - \sqrt{\rho}\, X H \right\|_F^2 \tag{4}$$

B. Algebraic Constraints of QSTBC 

The key idea of QSTBC is to divide the $L$ (real) transmitted
symbols embedded in a code matrix into $\Gamma$ groups, so that the
ML detection of the transmitted symbol vector can be decoupled
into $\Gamma$ sub-metrics, each of which involves the symbols of
only one group [6], [8], [10], [16]. We provide a definition of
STBC with this feature to unify the notation in this paper as
follows.

Definition 1: A STBC is said to be $\Gamma$-group decodable
STBC if the ML decoding metric in (4) can be decoupled into
a linear sum of $\Gamma$ independent submetrics, each of which
consists of the symbols from only one group. The $\Gamma$-group
decodable STBC is denoted by $\Gamma$Gp-STBC for short.

In the most general case, we assume that there are $\Gamma$ groups;
each group is denoted by $\Omega_i$ ($i = 1, 2, \ldots, \Gamma$) and has $L_i$
symbols. Thus $L = \sum_{i=1}^{\Gamma} L_i$. Let $\Theta_i$ be the set of indexes of
symbols in the group $\Omega_i$.

Yuen et al. [16, Theorem 1] have shown a sufficient
condition for a STBC to be $\Gamma$-group decodable. In fact, this
condition is also necessary. We will state these results in the
following theorem without proof for brevity.

Theorem 1: The necessary and sufficient conditions, so that
a STBC is $\Gamma$-group decodable, are

$$C_p^\dagger C_q + C_q^\dagger C_p = 0, \quad \forall p \in \Theta_i,\ \forall q \in \Theta_j,\ 1 \le i \ne j \le \Gamma. \tag{5}$$

Note that Theorem 1 covers [17, Theorem 9] (single-symbol
decodable STBC) and can be shown similarly.
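
As a small illustration of the pairwise test in Theorem 1, the sketch below checks $C_p^\dagger C_q + C_q^\dagger C_p = 0$ for the 2x2 Alamouti code written in the real-symbol form (2). The choice of the Alamouti code, the helper names, and the nested-list matrix representation are assumptions made for this example; they are not taken from the paper.

```python
# Pairwise check of condition (5) on the 2x2 Alamouti code (illustrative choice).

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def herm(A):
    """Conjugate transpose."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def cross_term(Cp, Cq):
    """Return Cp^H Cq + Cq^H Cp, the left-hand side of (5)."""
    P, Q = matmul(herm(Cp), Cq), matmul(herm(Cq), Cp)
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def is_zero(A, tol=1e-12):
    return all(abs(v) < tol for row in A for v in row)

def is_group_decodable(C, groups):
    """True if (5) holds for every pair of symbols taken from different groups."""
    for gi in range(len(groups)):
        for gj in range(gi + 1, len(groups)):
            for p in groups[gi]:
                for q in groups[gj]:
                    if not is_zero(cross_term(C[p], C[q])):
                        return False
    return True

# X = [[s1, s2], [-conj(s2), conj(s1)]] with s1 = c1 + j*c2, s2 = c3 + j*c4,
# so X = c1*C[0] + c2*C[1] + c3*C[2] + c4*C[3].
C = [
    [[1, 0], [0, 1]],
    [[1j, 0], [0, -1j]],
    [[0, 1], [-1, 0]],
    [[0, 1j], [1j, 0]],
]

single_symbol = is_group_decodable(C, [[0], [1], [2], [3]])
print("each real symbol is its own group:", single_symbol)
```

For an orthogonal design such as this one, every pair of distinct dispersion matrices passes the test, which is why each real symbol can be detected on its own.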

III. Four-group Decodable STBC Derived from 
QSTBC 

A. Encoding 

In this section, we will study the new 4Gp-QSTBC. As we
will see later, the general form of STBC in (1) is convenient
for studying 4Gp-QSTBC; hence Theorem 1 can be restated
as follows.

Lemma 1 ([18]): The necessary and sufficient conditions
for a STBC in (1) to become $\Gamma$-group decodable are: (a)
$A_p^\dagger A_q + A_q^\dagger A_p = 0$, (b) $B_p^\dagger B_q + B_q^\dagger B_p = 0$, and (c)
$A_p^\dagger B_q + B_q^\dagger A_p = 0$, $\forall p \in \Theta_i$, $\forall q \in \Theta_j$, $1 \le i \ne j \le \Gamma$.

We next consider another sufficient condition so that a 
STBC is four-group decodable. 

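Theorem 2: Given a 4Gp-STBC for $M$ Tx antennas with
code length $T$ and $K$ sets of dispersion matrices $(A_k, B_k;\ 1 \le
k \le K)$, a 4Gp-STBC with code length $2T$ for $2M$ Tx
antennas, which consists of $2K$ sets of dispersion matrices
denoted as $(\bar{A}_i, \bar{B}_i)$, $1 \le i \le 2K$, can be constructed using
the following mapping rules:

$$\bar{A}_{2k-1} = \begin{bmatrix} A_k & 0 \\ 0 & A_k \end{bmatrix}, \quad
\bar{A}_{2k} = \begin{bmatrix} 0 & A_k \\ A_k & 0 \end{bmatrix}, \quad
\bar{B}_{2k-1} = \begin{bmatrix} B_k & 0 \\ 0 & B_k \end{bmatrix}, \quad
\bar{B}_{2k} = \begin{bmatrix} 0 & B_k \\ B_k & 0 \end{bmatrix}. \tag{6}$$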



Proof: If the dispersion matrices $(A_q, B_q)$ ($1 \le q \le K$) satisfy
Lemma 1 with $(A_p, B_p)$ ($1 \le p \le K$), where $q \notin \Theta_p$,
then the dispersion matrices $(\bar{A}_{2q-1}, \bar{B}_{2q-1}, \bar{A}_{2q}, \bar{B}_{2q})$
constructed from $(A_q, B_q)$ using (6) will satisfy Theorem 2 with
$(\bar{A}_{2p-1}, \bar{B}_{2p-1}, \bar{A}_{2p}, \bar{B}_{2p})$ constructed from $(A_p, B_p)$ using
(6). The detailed proof is omitted here, as the steps are routine.

□
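
The doubling step of Theorem 2 can be tried numerically. The sketch below is only an illustration under stated assumptions: it starts from the 2x2 Alamouti code (four groups of one real symbol each, chosen here for brevity rather than the MDC-QSTBC used in the paper), applies the block-diagonal and block-anti-diagonal mapping of (6), and checks the cross-group conditions of Lemma 1. All function and variable names are hypothetical.

```python
# Apply mapping (6) to a small 4-group code and re-check the Lemma 1 conditions.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def herm(A):
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def cross_term(P, Q):
    """Return P^H Q + Q^H P."""
    X, Y = matmul(herm(P), Q), matmul(herm(Q), P)
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def is_zero(A, tol=1e-12):
    return all(abs(v) < tol for row in A for v in row)

def block2(tl, tr, bl, br):
    """Assemble [[tl, tr], [bl, br]] from equally sized blocks."""
    return [l + r for l, r in zip(tl, tr)] + [l + r for l, r in zip(bl, br)]

def expand(M):
    """Mapping (6): block-diagonal and block-anti-diagonal copies of M."""
    Z = [[0] * len(M[0]) for _ in M]
    return block2(M, Z, Z, M), block2(Z, M, M, Z)

# Alamouti code in the form (1): X = a1*A1 + b1*B1 + a2*A2 + b2*B2.
A1 = [[1, 0], [0, 1]]
B1 = [[1j, 0], [0, -1j]]
A2 = [[0, 1], [-1, 0]]
B2 = [[0, 1j], [1j, 0]]

# Each original real symbol is its own group; after (6) every group holds
# two 4x4 dispersion matrices.
groups = [list(expand(M)) for M in (A1, B1, A2, B2)]

cross_ok = all(
    is_zero(cross_term(P, Q))
    for i in range(4) for j in range(i + 1, 4)
    for P in groups[i] for Q in groups[j]
)
within_coupled = not is_zero(cross_term(groups[0][0], groups[0][1]))
print("cross-group terms vanish:", cross_ok)
print("symbols inside one group stay coupled:", within_coupled)
```

The number of groups stays at four while the number of symbols per group doubles, which is the behaviour the recursive construction relies on.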

The recursive construction of 4Gp-STBC specified in Theorem 2
suggests that we can start with the MDC-QSTBC for
4 Tx antennas proposed in [12] to construct 4Gp-STBC for
8, 16 Tx antennas and so on, because MDC-QSTBC is one
of the STBC satisfying Lemma 1; the resulting STBC is thus
called 4Gp-QSTBC. For practical interest, we will illustrate
the encoding process of 4Gp-QSTBC for 8 Tx antennas from
the MDC-QSTBC for 4 Tx antennas [12]. The code matrix of
MDC-QSTBC for 4 Tx antennas is

$$F_4 = \begin{bmatrix}
a_1 + j a_3 & a_2 + j a_4 & b_1 + j b_3 & b_2 + j b_4 \\
-a_2 + j a_4 & a_1 - j a_3 & -b_2 + j b_4 & b_1 - j b_3 \\
b_1 + j b_3 & b_2 + j b_4 & a_1 + j a_3 & a_2 + j a_4 \\
-b_2 + j b_4 & b_1 - j b_3 & -a_2 + j a_4 & a_1 - j a_3
\end{bmatrix} \tag{7}$$

where $j^2 = -1$.



The code matrix of 4Gp-QSTBC for 8 Tx antennas from
$F_4$ using mapping rules in (6) is given below:

$$F_8 = \begin{bmatrix} F_a & F_b \\ F_b & F_a \end{bmatrix}, \tag{8}$$

where

$$F_a = \begin{bmatrix}
a_1 + j a_5 & a_3 + j a_7 & a_2 + j a_6 & a_4 + j a_8 \\
-a_3 + j a_7 & a_1 - j a_5 & -a_4 + j a_8 & a_2 - j a_6 \\
a_2 + j a_6 & a_4 + j a_8 & a_1 + j a_5 & a_3 + j a_7 \\
-a_4 + j a_8 & a_2 - j a_6 & -a_3 + j a_7 & a_1 - j a_5
\end{bmatrix}, \quad
F_b = \begin{bmatrix}
b_1 + j b_5 & b_3 + j b_7 & b_2 + j b_6 & b_4 + j b_8 \\
-b_3 + j b_7 & b_1 - j b_5 & -b_4 + j b_8 & b_2 - j b_6 \\
b_2 + j b_6 & b_4 + j b_8 & b_1 + j b_5 & b_3 + j b_7 \\
-b_4 + j b_8 & b_2 - j b_6 & -b_3 + j b_7 & b_1 - j b_5
\end{bmatrix}.$$


The code rate of 4Gp-QSTBC for 8 Tx antennas is one
symbol pcu. In general, by construction, the rate of 4Gp-QSTBC
for $2M$ Tx antennas is the same as the rate of MDC-QSTBC
for $M$ Tx antennas. Since the maximal rate of MDC-QSTBC
is one symbol pcu [12], the maximal achievable rate
of 4Gp-QSTBC is also one symbol pcu for $2^m$ Tx antennas. If
the number of Tx antennas is $M < 2^m$ ($m = 2, 3, \ldots$), then
$(2^m - M)$ columns of the code matrix for $2^m$ Tx antennas
can be deleted to obtain the code for $M$ antennas. Thus, the
maximum rate of 4Gp-QSTBC is one symbol pcu and it is
achievable for any number of Tx antennas. Additionally, the
$4 \times 4$ code matrix $F_4$ is square. By recursive construction,
the code matrices of 4Gp-QSTBC are also square for $2^m$ Tx
antennas; therefore, 4Gp-QSTBC are delay-optimal if the
number of Tx antennas is $2^m$ [17].

B. Decoding 

We know that the symbols $s_1, s_2, s_3, s_4$ of $F_4$ can be
separately detected [12]. Therefore, from Theorem 2, the 4
groups of 8 symbols of $F_8$ can be detected independently.
These 4 groups are $(s_1, s_2)$, $(s_3, s_4)$, $(s_5, s_6)$, and $(s_7, s_8)$. The
ML metric given in (4) can be derived to detect the 4 groups
of symbols of $F_8$. However, to provide more insights into
the decoding of 4Gp-QSTBC, we will derive an equivalent
code and the equivalent channel of $F_8$. Furthermore, using
the equivalent channel of $F_8$, we can use a sphere decoder
[11] to reduce the complexity of the ML search.

The equivalent code of $F_8$ is obtained by column permutations
for the code matrix of $F_8$ in (8): the order of
columns is changed to (1, 3, 5, 7, 2, 4, 6, 8). This order
of permutations is also applied for the rows of $F_8$. Let
$x_1 = a_1 + j a_5$, $x_2 = a_2 + j a_6$, $x_3 = b_1 + j b_5$, $x_4 = b_2 + j b_6$,
$x_5 = a_3 + j a_7$, $x_6 = a_4 + j a_8$, $x_7 = b_3 + j b_7$, $x_8 = b_4 + j b_8$ be
the intermediate variables; we obtain a permutation-equivalent
code of $F_8$ below

$$D = \begin{bmatrix} D_1 & D_2 \\ -D_2^* & D_1^* \end{bmatrix} \tag{9}$$

where

$$D_1 = \begin{bmatrix}
x_1 & x_2 & x_3 & x_4 \\
x_2 & x_1 & x_4 & x_3 \\
x_3 & x_4 & x_1 & x_2 \\
x_4 & x_3 & x_2 & x_1
\end{bmatrix}, \quad
D_2 = \begin{bmatrix}
x_5 & x_6 & x_7 & x_8 \\
x_6 & x_5 & x_8 & x_7 \\
x_7 & x_8 & x_5 & x_6 \\
x_8 & x_7 & x_6 & x_5
\end{bmatrix}. \tag{10}$$

The sub-matrices $D_1$ and $D_2$ have a special form called
block-circulant matrix with circulant blocks [13].

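We next show how to decode the code $D$. For simplicity,
a single Rx antenna is considered. The generalization for
multiple Rx antennas is straightforward. Assuming that the Tx
symbols are drawn from a constellation with unit average
power, the Tx-Rx signal model in (3) for the case of STBC
$D$ follows

$$y = \sqrt{\rho/8}\; D h + z. \tag{11}$$

Let

$$\hat{y} = [y_1\ \ldots\ y_4\ \ y_5^*\ \ldots\ y_8^*]^T, \quad
x = [x_1\ x_2\ \ldots\ x_8]^T, \quad
\hat{z} = [z_1\ \ldots\ z_4\ \ z_5^*\ \ldots\ z_8^*]^T,$$

$$\mathcal{H}_1 = \begin{bmatrix}
h_1 & h_2 & h_3 & h_4 \\
h_2 & h_1 & h_4 & h_3 \\
h_3 & h_4 & h_1 & h_2 \\
h_4 & h_3 & h_2 & h_1
\end{bmatrix}, \quad \text{and} \quad
\mathcal{H}_2 = \begin{bmatrix}
h_5 & h_6 & h_7 & h_8 \\
h_6 & h_5 & h_8 & h_7 \\
h_7 & h_8 & h_5 & h_6 \\
h_8 & h_7 & h_6 & h_5
\end{bmatrix}. \tag{12}$$

We have an equivalent expression of (11) as

$$\hat{y} = \sqrt{\rho/8}\, \begin{bmatrix} \mathcal{H}_1 & \mathcal{H}_2 \\ \mathcal{H}_2^* & -\mathcal{H}_1^* \end{bmatrix} x + \hat{z}
= \sqrt{\rho/8}\; \mathcal{H} x + \hat{z}. \tag{13}$$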



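Note that $\mathcal{H}_1$ and $\mathcal{H}_2$ are block-circulant matrices with
circulant blocks [13]. Thus, they commute, and so do
$\mathcal{H}_1^*$ and $\mathcal{H}_2$. We can multiply both sides of (13) with $\mathcal{H}^\dagger$ to
get

$$\mathcal{H}^\dagger \hat{y} = \sqrt{\rho/8}\, \begin{bmatrix}
\mathcal{H}_1^\dagger \mathcal{H}_1 + \mathcal{H}_2^\dagger \mathcal{H}_2 & 0 \\
0 & \mathcal{H}_1^\dagger \mathcal{H}_1 + \mathcal{H}_2^\dagger \mathcal{H}_2
\end{bmatrix} x + \mathcal{H}^\dagger \hat{z}. \tag{14}$$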
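
The decoupling in (13) and (14) can be checked numerically. The following sketch, assuming a single Rx antenna and randomly drawn test values, builds $D$ and the equivalent channel from the index pattern "entry $(i, k)$ uses element $i \oplus k$" (bitwise XOR on zero-based indices), then verifies that $\hat{y}$ equals $\mathcal{H} x$ in the noise-free, unscaled case and that $\mathcal{H}^\dagger \mathcal{H}$ is block diagonal. Helper names and the random seed are illustrative assumptions.

```python
import random

def bccb(v):
    """4x4 block-circulant matrix with circulant blocks: entry (i, k) = v[i XOR k]."""
    return [[v[i ^ k] for k in range(4)] for i in range(4)]

def conj(M):
    return [[e.conjugate() for e in row] for row in M]

def neg(M):
    return [[-e for e in row] for row in M]

def herm(M):
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def block2(tl, tr, bl, br):
    return [l + r for l, r in zip(tl, tr)] + [l + r for l, r in zip(bl, br)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

rng = random.Random(0)
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]  # channel gains
x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]  # x_1..x_8

# Code matrix D of (9)-(10) and equivalent channel of (12)-(13).
D1, D2 = bccb(x[:4]), bccb(x[4:])
D = block2(D1, D2, neg(conj(D2)), conj(D1))
H1, H2 = bccb(h[:4]), bccb(h[4:])
H = block2(H1, H2, conj(H2), neg(conj(H1)))

# Noise-free received vector; the last four entries are conjugated as in (12).
y = matvec(D, h)
y_hat = y[:4] + [v.conjugate() for v in y[4:]]
model_err = max(abs(a - b) for a, b in zip(y_hat, matvec(H, x)))

# H^H H should equal diag(H_hat, H_hat) with H_hat = H1^H H1 + H2^H H2.
G = matmul(herm(H), H)
P, Q = matmul(herm(H1), H1), matmul(herm(H2), H2)
H_hat = [[P[i][j] + Q[i][j] for j in range(4)] for i in range(4)]
off_block = max(abs(G[i][j + 4]) for i in range(4) for j in range(4))
diag_err = max(max(abs(G[i][j] - H_hat[i][j]), abs(G[i + 4][j + 4] - H_hat[i][j]))
               for i in range(4) for j in range(4))
print(model_err, off_block, diag_err)
```

All three printed quantities are at the level of floating-point rounding, so the two halves of $x$ can be detected separately after the matched-filter step.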



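It can be shown that the noise elements of vector $\mathcal{H}^\dagger \hat{z}$ are
correlated with covariance matrix $\mathcal{H}^\dagger \mathcal{H}$. Thus this noise vector
can be whitened by multiplying both sides of (14) with the
matrix $(\mathcal{H}^\dagger \mathcal{H})^{-1/2}$. Let $\hat{\mathcal{H}} = \mathcal{H}_1^\dagger \mathcal{H}_1 + \mathcal{H}_2^\dagger \mathcal{H}_2$. After the noise
whitening step, (14) is equivalent to the following equations

$$\hat{\mathcal{H}}^{-1/2}\, \bar{y}_i = \sqrt{\rho/8}\; \hat{\mathcal{H}}^{1/2}\, \bar{x}_i + \bar{z}_i \quad (i = 1, 2), \tag{15}$$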



IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. X, NO. X, X 200X 



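where $\bar{y}_i = [\bar{y}_{4i-3}\ \bar{y}_{4i-2}\ \bar{y}_{4i-1}\ \bar{y}_{4i}]^T$ collects the corresponding
entries of $\mathcal{H}^\dagger \hat{y}$, $\bar{x}_i = [x_{4i-3}\ x_{4i-2}\ x_{4i-1}\ x_{4i}]^T$, and the noise vectors
$\bar{z}_i = \hat{\mathcal{H}}^{-1/2} [\bar{z}_{4i-3}\ \bar{z}_{4i-2}\ \bar{z}_{4i-1}\ \bar{z}_{4i}]^T$, built from the entries $\bar{z}_k$ of
$\mathcal{H}^\dagger \hat{z}$, are uncorrelated and have elements $\sim \mathcal{CN}(0, 1)$.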



At this point, the decoding of the 8 transmitted symbols of 
the code D can be readily decoupled into 2 groups. However, 
since the code is a 4Gp-STBC, we can further decompose 
them into 4 groups in the following. 

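Denote the $2 \times 2$ (real) discrete Fourier transform (DFT)
matrix by $\mathcal{F}_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$. The block-circulant matrices $\mathcal{H}_1$
and $\mathcal{H}_2$ can be diagonalized by a (real) unitary matrix $\Theta =
\frac{1}{2} \mathcal{F}_2 \otimes \mathcal{F}_2$ [13, Theorem 5.8.2, p. 185]. Note that $\Theta^\dagger = \Theta$;
therefore, $\mathcal{H}_1 = \Theta \Lambda_1 \Theta$ and $\mathcal{H}_2 = \Theta \Lambda_2 \Theta$, where $\Lambda_1$ and $\Lambda_2$
are diagonal matrices with the eigenvalues of $\mathcal{H}_1$ and $\mathcal{H}_2$ on the
main diagonal, respectively. Thus, $\hat{\mathcal{H}} = \Theta (\Lambda_1^\dagger \Lambda_1 + \Lambda_2^\dagger \Lambda_2) \Theta$,
and also $\hat{\mathcal{H}}^{1/2} = \Theta (\Lambda_1^\dagger \Lambda_1 + \Lambda_2^\dagger \Lambda_2)^{1/2} \Theta$. Since $\hat{\mathcal{H}}^{1/2}$ is a real
matrix, (15) becomes

$$\hat{\mathcal{H}}^{-1/2}\, \Re(\bar{y}_i) = \sqrt{\rho/8}\; \hat{\mathcal{H}}^{1/2}\, \Re(\bar{x}_i) + \Re(\bar{z}_i), \quad i = 1, 2, \tag{16a}$$

$$\hat{\mathcal{H}}^{-1/2}\, \Im(\bar{y}_i) = \sqrt{\rho/8}\; \hat{\mathcal{H}}^{1/2}\, \Im(\bar{x}_i) + \Im(\bar{z}_i), \quad i = 1, 2. \tag{16b}$$

Note that $\Re(\bar{x}_1) = [a_1\ a_2\ b_1\ b_2]^T := d_1$, i.e. $\Re(\bar{x}_1)$ is
only dependent on the complex symbols $s_1$ and $s_2$. Similarly,
$\Re(\bar{x}_2)$, $\Im(\bar{x}_1)$, and $\Im(\bar{x}_2)$ depend on $(s_3, s_4)$, $(s_5, s_6)$, and
$(s_7, s_8)$, respectively.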

Eqs. (16a) and (16b) show that the decoding of the 8 transmitted symbols
of STBC $D$ is separated into the decoding of 4 groups, each
with two symbols (thus the search space size has been reduced
from $Q^8$ to $4Q^2$, where $Q$ is the transmit constellation size). A
sphere decoder [11] can also be used to reduce the complexity
of the ML search for each group. The matrix $\hat{\mathcal{H}}^{1/2}$ can be
considered as the equivalent channel of the 4Gp-QSTBC $D$.

C. Performance Analysis 

In (16a) and (16b), the PEP of the four transmit symbol vectors are
the same. We thus need to consider the PEP of only one of the
vectors, $d_1 = \Re(\bar{x}_1) = [a_1\ a_2\ b_1\ b_2]^T$. For notational
simplicity, the subindex 1 of $d_1$ is dropped. Additionally, we
can introduce redundancy on the signal space by applying a $4 \times 4$
real unitary rotation $R$ to the data vector $[a_1\ a_2\ b_1\ b_2]^T$.
Thus the data vector is $d = R\, [a_1\ a_2\ b_1\ b_2]^T$.

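From (16a) and (16b), the PEP of the pair $d$ and $\hat{d}$ can be expressed
by the Gaussian tail function as [19]

$$P(d \to \hat{d} \mid \hat{\mathcal{H}}) = Q\!\left( \sqrt{ \frac{\rho\, \| \hat{\mathcal{H}}^{1/2} R \delta \|_F^2}{8 \cdot 4 N_0} } \right)
= Q\!\left( \sqrt{ \frac{\rho\, \| \hat{\mathcal{H}}^{1/2} R \delta \|_F^2}{16} } \right) \tag{17}$$

where $\delta = d - \hat{d}$ and $N_0 = 1/2$ is the variance of the elements
of the white noise vector $\Re(\bar{z}_i)$ in (16a) and (16b).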

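Remember that $\Lambda_i$ is a diagonal matrix with the eigenvalues of
$\mathcal{H}_i$ on the main diagonal. Let $\lambda_{i,j}$ ($i = 1, 2$; $j = 1, 2, 3, 4$) be
the eigenvalues of $\mathcal{H}_i$. Then $\Lambda_i = \mathrm{diag}(\lambda_{i,1}, \lambda_{i,2}, \lambda_{i,3}, \lambda_{i,4})$.
Let $\beta = \Theta R \delta$; we have

$$P(d \to \hat{d} \mid \hat{\mathcal{H}}) = Q\!\left( \sqrt{ \frac{\rho \sum_{i=1}^{2} \sum_{j=1}^{4} \beta_j^2 |\lambda_{i,j}|^2}{16} } \right). \tag{18}$$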



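To derive a closed form of (18), we need to evaluate
the distribution of $\lambda_{i,j}$. The eigenvectors of $\mathcal{H}_1$ are the
columns of the matrix $\Theta = \frac{1}{2} \mathcal{F}_2 \otimes \mathcal{F}_2$. Thus, the eigenvalues
of $\mathcal{H}_1$ are $[\lambda_{1,1}\ \lambda_{1,2}\ \lambda_{1,3}\ \lambda_{1,4}]^T = (\mathcal{F}_2 \otimes
\mathcal{F}_2) [h_1\ h_2\ h_3\ h_4]^T$. Since $h_j \sim \mathcal{CN}(0, 1)$ for $j =
1, \ldots, 4$, we have $\lambda_{1,j} \sim \mathcal{CN}(0, 4)$, and the same holds for $\lambda_{2,j}$.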
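
The eigenvalue relation can be confirmed with a few lines of code. This is a minimal sketch with hypothetical helper names and a randomly drawn channel: it forms $\Theta = \frac{1}{2} \mathcal{F}_2 \otimes \mathcal{F}_2$, checks that $\Theta \mathcal{H}_1 \Theta$ is diagonal, and compares the diagonal with $(\mathcal{F}_2 \otimes \mathcal{F}_2) [h_1\ h_2\ h_3\ h_4]^T$.

```python
import random

F2 = [[1, 1], [1, -1]]

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

W = kron(F2, F2)                                   # unnormalized 4x4 transform
Theta = [[w / 2 for w in row] for row in W]        # real, symmetric, unitary

rng = random.Random(1)
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
H1 = [[h[i ^ k] for k in range(4)] for i in range(4)]   # pattern of (12)

Lam = matmul(matmul(Theta, H1), Theta)             # should be diagonal
eig = [sum(W[j][m] * h[m] for m in range(4)) for j in range(4)]

off_diag = max(abs(Lam[i][j]) for i in range(4) for j in range(4) if i != j)
diag_err = max(abs(Lam[j][j] - eig[j]) for j in range(4))

# Each eigenvalue is a +/-1 combination of four unit-variance gains,
# so its variance is the squared norm of one row of W.
var_factor = sum(w * w for w in W[0])
print(off_diag, diag_err, var_factor)
```

The last value printed is 4, matching the $\mathcal{CN}(0, 4)$ distribution stated for $\lambda_{i,j}$.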

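We now use Craig's formula [20] to rewrite the conditional
PEP in (18):

$$P(d \to \hat{d} \mid \hat{\mathcal{H}}) = Q\!\left( \sqrt{ \frac{\rho \sum_{i=1}^{2} \sum_{j=1}^{4} \beta_j^2 |\lambda_{i,j}|^2}{16} } \right)
= \frac{1}{\pi} \int_0^{\pi/2} \exp\!\left( \frac{ -\rho \sum_{i=1}^{2} \sum_{j=1}^{4} \beta_j^2 |\lambda_{i,j}|^2 }{ 32 \sin^2 \alpha } \right) d\alpha. \tag{19}$$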



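Applying a method based on the moment generating function
[19], we obtain the unconditional PEP as:

$$P(d \to \hat{d}) = \frac{1}{\pi} \int_0^{\pi/2} \prod_{j=1}^{4} \left( 1 + \frac{\rho \beta_j^2}{8 \sin^2 \alpha} \right)^{-2} d\alpha. \tag{20}$$

If $\beta_j \ne 0$, $\forall j = 1, \ldots, 4$, then $1 + \frac{\rho \beta_j^2}{8 \sin^2 \alpha} \approx \frac{\rho \beta_j^2}{8 \sin^2 \alpha}$ at high
SNR, and the approximation of the exact PEP in (20) is

$$P(d \to \hat{d}) \approx \frac{8^8}{\pi} \left( \int_0^{\pi/2} (\sin \alpha)^{16}\, d\alpha \right) \rho^{-8} \prod_{j=1}^{4} \beta_j^{-4}
= \frac{2^7\, 16!}{8!\, 8!}\; \rho^{-8} \prod_{j=1}^{4} \beta_j^{-4}. \tag{21}$$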
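
A quick numerical comparison shows how close the high-SNR expression is to the exact integral. The sketch below evaluates the integral in (20) with a midpoint rule and compares it with the closed form in (21); the vector $\beta = [1, 1, 1, 1]$, the 40 dB SNR, and the number of integration steps are arbitrary illustrative choices.

```python
import math

def exact_pep(rho, beta, steps=20000):
    """Midpoint-rule evaluation of the integral in (20)."""
    width = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        s2 = math.sin((k + 0.5) * width) ** 2
        term = 1.0
        for b in beta:
            term *= (1.0 + rho * b * b / (8.0 * s2)) ** -2
        total += term
    return total * width / math.pi

def asymptotic_pep(rho, beta):
    """Closed form of (21): 2^7 * 16! / (8! 8!) * rho^-8 * prod(beta_j^-4)."""
    coeff = 2 ** 7 * math.factorial(16) // (math.factorial(8) ** 2)
    prod = 1.0
    for b in beta:
        prod *= b ** -4
    return coeff * rho ** -8 * prod

rho = 10 ** (40 / 10)            # 40 dB, an arbitrary high-SNR operating point
beta = [1.0, 1.0, 1.0, 1.0]      # hypothetical rotated difference vector
ratio = exact_pep(rho, beta) / asymptotic_pep(rho, beta)
print("exact / asymptotic at 40 dB:", ratio)
```

The ratio stays slightly below one because dropping the "1 +" term can only enlarge each factor, so the asymptotic expression is an upper bound that tightens as the SNR grows.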



The exponent of SNR in (21) is $-8$. This indicates that
the maximum diversity order of 4Gp-QSTBC is 8 and it
is achievable if the product distance $\prod_{j=1}^{4} \beta_j$ (see [21] and
references therein) is nonzero for all possible data vectors.
Furthermore, at high SNR, the asymptotic PEP becomes very
tight to the exact PEP. Recall that $\beta = \Theta R (d - \hat{d})$; thus,
the product matrix $\Theta R$ is the combined rotation matrix for
data vector $d$. Since $\Theta$ is a constant matrix, we can optimize
the matrix $R$ so that the minimum product distance $d_{p,\min} =
\min_{\forall d^i, d^j} \prod_{k=1}^{4} |\beta_k|$, where $\beta = \Theta R (d^i - d^j)$, is nonzero
and maximized.

If the complex signals are drawn from QAM, the (real)
elements of $d$ are in the set $\{\pm 1, \pm 3, \pm 5, \ldots\}$. The best known
rotations for QAM in terms of maximizing the minimum
product distance are provided in [21], [22]. Denoting the
rotation matrix in [21], [22] by $R_{BOV}$, the signal rotation
for our 4Gp-QSTBC is given by

$$R = \Theta R_{BOV}. \tag{22}$$

Simulations show that the above vector signal rotation performs
better than the symbol-wise rotation proposed in [18] (details
omitted for brevity). We have presented important properties
of 4Gp-QSTBC. In the next section, we will investigate 4Gp-SAST
codes.
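IV. Four-Group Decodable STBC Derived from
SAST Codes

A. Encoding

The SAST code matrix is constructed for $M = 2\bar{M}$ Tx antennas
using circulant blocks. Two length-$\bar{M}$ data vectors $\mathbf{s}_1 =
[s_1\ s_2\ \ldots\ s_{\bar{M}}]^T$ and $\mathbf{s}_2 = [s_{\bar{M}+1}\ s_{\bar{M}+2}\ \ldots\ s_{2\bar{M}}]^T$
are used to generate two $\bar{M}$-by-$\bar{M}$ circulant matrices [13].
Note that the first row of circulant matrix $C(x)$ copies the
row vector $x$; the $i$th row is obtained by circularly shifting
the vector $x$ to the right $(i - 1)$ times. The SAST code matrix is
constructed as

$$S = \begin{bmatrix} C(\mathbf{s}_1^T) & C(\mathbf{s}_2^T) \\ -C^\dagger(\mathbf{s}_2^T) & C^\dagger(\mathbf{s}_1^T) \end{bmatrix}. \tag{23}$$

By construction, 4Gp-SAST codes have a rate of one symbol
pcu; the code matrices for an even number of Tx antennas
are square; thus 4Gp-SAST codes are delay-optimal for an even
number of Tx antennas.

B. Decoder of 4Gp-SAST codes

Similar to 4Gp-QSTBC, the decoding of 4Gp-SAST codes
requires two steps. First, the two data vectors $\mathbf{s}_1$ and $\mathbf{s}_2$ are
decoupled [10]; then, the real and imaginary parts of vectors $\mathbf{s}_1$
and $\mathbf{s}_2$ are separated. We provide the detailed decoder with only
one Rx antenna, as the generalization for multiple Rx antennas can
be easily done.

We introduce another type of circulant matrix called left
circulant, denoted by $C_L(x)$, where the $i$th row is obtained by
circularly shifting the row vector $x$ to the left $(i - 1)$ times.

Let us define a permutation $\Pi$ on an arbitrary $\bar{M} \times \bar{M}$ matrix
$X$ such that the $(\bar{M} - i + 2)$th row is permuted with the $i$th
row for $i = 2, 3, \ldots, \lceil \bar{M}/2 \rceil$, where $\lceil \cdot \rceil$ is the ceiling function.
One can verify that

$$\Pi(C_L(x)) = C(x). \tag{24}$$

Let $\mathbf{y} = [\mathbf{y}_1^T\ \mathbf{y}_2^T]^T$, $\mathbf{y}_1 = [y_1\ y_2\ \ldots\ y_{\bar{M}}]^T$,
$\mathbf{y}_2 = [y_{\bar{M}+1}\ y_{\bar{M}+2}\ \ldots\ y_{M}]^T$, $\mathbf{h} = [\mathbf{h}_1^T\ \mathbf{h}_2^T]^T$, $\mathbf{h}_1 =
[h_1\ h_2\ \ldots\ h_{\bar{M}}]^T$, $\mathbf{h}_2 = [h_{\bar{M}+1}\ h_{\bar{M}+2}\ \ldots\ h_{2\bar{M}}]^T$,
$\mathbf{z} = [\mathbf{z}_1^T\ \mathbf{z}_2^T]^T$, $\mathbf{z}_1 = [z_1\ z_2\ \ldots\ z_{\bar{M}}]^T$, $\mathbf{z}_2 =
[z_{\bar{M}+1}\ z_{\bar{M}+2}\ \ldots\ z_{2\bar{M}}]^T$. We can write the Tx-Rx signal
relation as

$$\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix} = \sqrt{\frac{\rho}{M}}
\begin{bmatrix} C(\mathbf{s}_1^T) & C(\mathbf{s}_2^T) \\ -C^\dagger(\mathbf{s}_2^T) & C^\dagger(\mathbf{s}_1^T) \end{bmatrix}
\begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \end{bmatrix} + \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \end{bmatrix}. \tag{25}$$

An equivalent form of (25) is

$$\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2^* \end{bmatrix} = \sqrt{\frac{\rho}{M}}
\begin{bmatrix} X_1 & X_2 \\ X_3 & X_4 \end{bmatrix}
\begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2^* \end{bmatrix} \tag{26}$$

where $X_1 = C_L(\mathbf{h}_1^T)$, $X_2 = C_L(\mathbf{h}_2^T)$, $X_3 = C^\dagger(\mathbf{h}_2^T)$, $X_4 =
-C^\dagger(\mathbf{h}_1^T)$.

Applying permutation $\Pi$ in (24) for the column matrix $\mathbf{y}_1$,
we obtain

$$\begin{bmatrix} \bar{\mathbf{y}}_1 \\ \bar{\mathbf{y}}_2 \end{bmatrix} = \begin{bmatrix} \Pi(\mathbf{y}_1) \\ \mathbf{y}_2^* \end{bmatrix}
= \sqrt{\frac{\rho}{M}} \begin{bmatrix} \Pi(X_1) & \Pi(X_2) \\ X_3 & X_4 \end{bmatrix} \begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \begin{bmatrix} \bar{\mathbf{z}}_1 \\ \bar{\mathbf{z}}_2 \end{bmatrix}
= \sqrt{\frac{\rho}{M}} \underbrace{\begin{bmatrix} H_1 & H_2 \\ H_2^\dagger & -H_1^\dagger \end{bmatrix}}_{\mathcal{H}} \begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \begin{bmatrix} \bar{\mathbf{z}}_1 \\ \bar{\mathbf{z}}_2 \end{bmatrix} \tag{27}$$

where $H_1 = C(\mathbf{h}_1^T)$, $H_2 = C(\mathbf{h}_2^T)$, $\bar{\mathbf{z}}_1 = \Pi(\mathbf{z}_1)$, $\bar{\mathbf{z}}_2 = \mathbf{z}_2^*$.

The elements of $\bar{\mathbf{z}}_1$ and $\bar{\mathbf{z}}_2$ are $\sim \mathcal{CN}(0, 1)$, as are the elements of
$\mathbf{z}_1$ and $\mathbf{z}_2$. We now multiply both sides of (27) with $\mathcal{H}^\dagger$. Let
$\hat{\mathcal{H}} = H_1^\dagger H_1 + H_2^\dagger H_2$; we get

$$\begin{bmatrix} \hat{\mathbf{y}}_1 \\ \hat{\mathbf{y}}_2 \end{bmatrix} = \sqrt{\frac{\rho}{M}}
\begin{bmatrix} \hat{\mathcal{H}} & 0 \\ 0 & \hat{\mathcal{H}} \end{bmatrix}
\begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \begin{bmatrix} \hat{\mathbf{z}}_1 \\ \hat{\mathbf{z}}_2 \end{bmatrix} \tag{28}$$

where $[\hat{\mathbf{y}}_1^T\ \hat{\mathbf{y}}_2^T]^T = \mathcal{H}^\dagger [\bar{\mathbf{y}}_1^T\ \bar{\mathbf{y}}_2^T]^T$ and $\hat{\mathbf{z}} = [\hat{\mathbf{z}}_1^T\ \hat{\mathbf{z}}_2^T]^T = \mathcal{H}^\dagger [\bar{\mathbf{z}}_1^T\ \bar{\mathbf{z}}_2^T]^T$.

The covariance matrix of the additive noise vector $\hat{\mathbf{z}}$ is

$$\mathbb{E}[\hat{\mathbf{z}} \hat{\mathbf{z}}^\dagger] = \begin{bmatrix} \hat{\mathcal{H}} & 0 \\ 0 & \hat{\mathcal{H}} \end{bmatrix}.$$

Therefore, the noise vectors $\hat{\mathbf{z}}_1$ and $\hat{\mathbf{z}}_2$
are uncorrelated and have the same covariance matrix $\hat{\mathcal{H}}$. Thus
$\mathbf{s}_1$ and $\mathbf{s}_2$ can be decoded separately using $\hat{\mathbf{y}}_i = \sqrt{\rho/M}\, \hat{\mathcal{H}} \mathbf{s}_i + \hat{\mathbf{z}}_i$,
$i = 1, 2$. The noise vectors $\hat{\mathbf{z}}_1$ and $\hat{\mathbf{z}}_2$ can be whitened by the
same whitening matrix $\hat{\mathcal{H}}^{-1/2}$. The equivalent equations for
Tx-Rx signals are

$$\hat{\mathcal{H}}^{-1/2}\, \hat{\mathbf{y}}_i = \sqrt{\rho/M}\; \hat{\mathcal{H}}^{1/2}\, \mathbf{s}_i + \hat{\mathcal{H}}^{-1/2}\, \hat{\mathbf{z}}_i, \quad i = 1, 2. \tag{29}$$

At this point, the decoding of SAST codes becomes the
detection of 2 groups of complex symbols $\mathbf{s}_i$ ($i = 1, 2$); this
is similar to the detection of 4Gp-QSTBC in (15). Our next
step is to separate the real and imaginary parts of vectors $\mathbf{s}_i$
to obtain 4 groups of symbols for data detection.

Recall that $\hat{\mathcal{H}} = H_1^\dagger H_1 + H_2^\dagger H_2$, and both $H_1$ and $H_2$
are circulant. Hence, $\hat{\mathcal{H}}$ is also circulant [13]. Let $\Lambda_i =
\mathrm{diag}(\lambda_{i,1}, \lambda_{i,2}, \ldots, \lambda_{i,\bar{M}})$ hold the $\bar{M}$ eigenvalues of $H_i$ ($i =
1, 2$). We can diagonalize $H_i$ by the DFT matrix $\mathcal{F}$ as $H_i = \mathcal{F}^\dagger \Lambda_i \mathcal{F}$.
Thus $\hat{\mathcal{H}} = \mathcal{F}^\dagger (\Lambda_1^\dagger \Lambda_1 + \Lambda_2^\dagger \Lambda_2) \mathcal{F}$. Let $\Lambda_1^\dagger \Lambda_1 + \Lambda_2^\dagger \Lambda_2 = \Lambda$; then
$\Lambda$ has real and non-negative entries on the main diagonal, and
$\hat{\mathcal{H}}^{1/2} = \mathcal{F}^\dagger \Lambda^{1/2} \mathcal{F}$ and $\hat{\mathcal{H}}^{-1/2} = \mathcal{F}^\dagger \Lambda^{-1/2} \mathcal{F}$.

We assume that $\mathbf{s}_i$ is pre-multiplied (or rotated) by an
IDFT matrix $\mathcal{F}^\dagger$ of proper size. Substituting $\mathbf{s}_i$ by $\mathcal{F}^\dagger \mathbf{s}_i$ and
multiplying both sides of (29) with the DFT matrix $\mathcal{F}$, we
obtain

$$\Lambda^{-1/2} \mathcal{F} \hat{\mathbf{y}}_i = \sqrt{\rho/M}\; \mathcal{F} \hat{\mathcal{H}}^{1/2} \mathcal{F}^\dagger \mathbf{s}_i + \Lambda^{-1/2} \mathcal{F} \hat{\mathbf{z}}_i
= \sqrt{\rho/M}\; \Lambda^{1/2} \mathbf{s}_i + \tilde{\mathbf{z}}_i, \tag{30}$$

where $\tilde{\mathbf{z}}_i = \Lambda^{-1/2} \mathcal{F} \hat{\mathbf{z}}_i$. Since $\Lambda^{1/2}$ is a real matrix, the real and imaginary parts of $\mathbf{s}_i$
($i = 1, 2$) can now be separated for detection:

$$\Lambda^{-1/2}\, \Re(\mathcal{F} \hat{\mathbf{y}}_i) = \sqrt{\rho/M}\; \Lambda^{1/2}\, \Re(\mathbf{s}_i) + \Re(\tilde{\mathbf{z}}_i), \tag{31a}$$

$$\Lambda^{-1/2}\, \Im(\mathcal{F} \hat{\mathbf{y}}_i) = \sqrt{\rho/M}\; \Lambda^{1/2}\, \Im(\mathbf{s}_i) + \Im(\tilde{\mathbf{z}}_i). \tag{31b}$$

This completes the derivation of the general decoder for 4Gp-SAST
codes. Using (31a) and (31b), one can use a sphere decoder to detect the
transmitted symbols. The equivalent channel of 4Gp-SAST
codes is $\Lambda^{1/2}$.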
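
Two identities carry the first decoding step: the row permutation $\Pi$ turns a left circulant matrix into a (right) circulant one, as in (24), and a circulant matrix built from the symbols applied to the channel vector equals a left circulant matrix built from the channel applied to the symbols, which is how (25) becomes (26). The sketch below checks both with small integer vectors; the function names and test vectors are illustrative assumptions.

```python
def circ(x):
    """Circulant matrix: row i is x circularly shifted i places to the right."""
    n = len(x)
    return [[x[(j - i) % n] for j in range(n)] for i in range(n)]

def left_circ(x):
    """Left circulant matrix: row i is x circularly shifted i places to the left."""
    n = len(x)
    return [[x[(j + i) % n] for j in range(n)] for i in range(n)]

def permute_rows(X):
    """Permutation Pi: swap row i with row n - i + 2 (1-indexed), i = 2..ceil(n/2)."""
    n = len(X)
    Y = [row[:] for row in X]
    for i in range(2, -(-n // 2) + 1):
        Y[i - 1], Y[n - i + 1] = Y[n - i + 1], Y[i - 1]
    return Y

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Identity (24) for an odd and an even size.
x_odd, x_even = [1, 2, 3, 4, 5], [1, 2, 3, 4]
pi_ok = (permute_rows(left_circ(x_odd)) == circ(x_odd)
         and permute_rows(left_circ(x_even)) == circ(x_even))

# Swapping the roles of symbols and channel: C(s) h == C_L(h) s.
s, h = [1, 2, 3], [4, 5, 6]
swap_ok = matvec(circ(s), h) == matvec(left_circ(h), s)
print(pi_ok, swap_ok)
```

Integer inputs are used so that both comparisons are exact; with complex channel gains the same identities hold up to rounding.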

C. Performance Analysis 

Note that the eigenvalues of the $\bar{M} \times \bar{M}$ matrices $H_1$ and $H_2$
can be found easily using the unnormalized DFT of the channel
vectors $\mathbf{h}_1$ and $\mathbf{h}_2$ [13]. Therefore, the eigenvalues of $H_1$ and
$H_2$ have distribution $\sim \mathcal{CN}(0, \bar{M})$.

Similar to the case of 4Gp-QSTBC, we can introduce a real
orthogonal transformation $R$ to the data vectors $\Re(\mathbf{s}_i)$ and
$\Im(\mathbf{s}_i)$ ($i = 1, 2$) to improve the performance of 4Gp-SAST
codes. Thus the actual signal rotation of 4Gp-SAST codes is
$\mathcal{F}^\dagger R$.

Since the PEP of vectors $\Re(\mathbf{s}_i)$ and $\Im(\mathbf{s}_i)$ ($i = 1, 2$) are
the same, we only calculate the PEP of the vector $\Re(\mathbf{s}_1)$.
Let $d = \Re(\mathbf{s}_1)$. The PEP of distinct vectors $d$ and $\hat{d}$ can
be calculated in a similar manner to that of 4Gp-QSTBC in
Section III-C; details are omitted for brevity. The PEP of 4Gp-SAST
codes is given below.



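$$P(d \to \hat{d}) = \frac{1}{\pi} \int_0^{\pi/2} \prod_{j=1}^{\bar{M}} \left( 1 + \frac{\rho \beta_j^2}{8 \sin^2 \alpha} \right)^{-2} d\alpha \tag{32}$$

where $[\beta_1\ \ldots\ \beta_{\bar{M}}]^T = R(d - \hat{d})$. One can find the
asymptotic PEP of 4Gp-SAST codes at high SNR in a similar
fashion to the case of 4Gp-QSTBC in (21) as follows.

$$P(d \to \hat{d}) \approx \frac{8^{2\bar{M}}}{\pi} \left( \int_0^{\pi/2} (\sin \alpha)^{4\bar{M}}\, d\alpha \right) \rho^{-2\bar{M}} \prod_{j=1}^{\bar{M}} \beta_j^{-4}. \tag{33}$$

Thus, if the product distance $\prod_{j=1}^{\bar{M}} \beta_j$ is nonzero, 4Gp-SAST
codes will achieve full diversity. Similar to 4Gp-QSTBC,
with QAM, the signal rotations $R_{BOV}$ in [21], [22]
can be used to minimize the worst-case PEP.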

Remark: It is interesting to recognize that the optimal rotation
matrices of 4Gp-QSTBC ($R = \Theta R_{BOV}$) and 4Gp-SAST
codes ($R = \mathcal{F}^\dagger R_{BOV}$) have a similar formula. The precoding
matrices $\Theta$ and $\mathcal{F}^\dagger$ are added to diagonalize the channels of the
two codes. Thus each real symbol is equivalently transmitted
in a separate channel, but full diversity is not achievable. The
real rotation matrix $R_{BOV}$ is applied to the data vectors so
that the real symbols are spread over all the channels, and thus
full diversity is achievable.

V. Simulation Results 

Simulation results are presented in Fig. 1 to compare the
performances of 4Gp-QSTBC and 4Gp-SAST codes with OS- 
TBC, MDC-QSTBC [12], QSTBC [6], and SAST codes [10] 
for 6 Tx and 1 Rx antennas. To produce the desired bit rates, 
two 8QAM constellations are used. The first constellation 
is rectangular, denoted by 8QAM-R, and has signal points 
{±1 ± j, ±3 ± j}. The other constellation, denoted by 8QAM- 
S, has the best minimum Euclidean distance; its geometrical 
shape is depicted in [6, Fig. 2(c)]. 



[Figure 1 plot. Horizontal axis: SNR [dB]. Curves: MDC-QSTBC, 16QAM, 3 bits pcu; QSTBC, 16QAM, 3 bits pcu; 4Gp-SAST, 8QAM-R, 3 bits pcu; SAST, 8QAM-R, 3 bits pcu; 4Gp-QSTBC, 8QAM-R, 3 bits pcu; 4Gp-QSTBC, 8QAM-S, 3 bits pcu; OSTBC, 8QAM-R, 2 bits pcu; OSTBC, 8QAM-S, 2 bits pcu; 4Gp-QSTBC, 4QAM, 2 bits pcu; 4Gp-SAST, 4QAM, 2 bits pcu; SAST, 4QAM, 2 bits pcu.]

Fig. 1. Performances of 4Gp-QSTBC and 4Gp-SAST codes compared with
OSTBC, MDC-QSTBC, QSTBC and SAST codes, 6 Tx and 1 Rx antennas,
2 and 3 bits pcu.



We compare the performance of our new codes with OSTBC 
and SAST codes for a spectral efficiency of 2 bits pcu. To 
get this bit rate, 8QAM signals are combined with rate-2/3 
OSTBC, while 4QAM is used for the SAST, 4Gp-QSTBC and 
4Gp-SAST codes. Two columns (4 and 8) of 4Gp-QSTBC
for 8 Tx antennas are deleted to create the code for 6 Tx
antennas. From Fig. 1, 4Gp-SAST codes gain 0.8 and 1.6
dB over OSTBC with 8QAM-S and 8QAM-R, respectively,
while the decoding complexity slightly increases (see Table I).
The performance improvement of 4Gp-QSTBC is even better:
1 dB compared with OSTBC (using 8QAM-S) and 0.2 dB
compared with 4Gp-SAST codes. Note that for 6 antennas,
the decoding complexity of 4Gp-QSTBC is slightly higher
than that of 4Gp-SAST codes (see Table I).

In Fig. 1, the performance of 4Gp-QSTBC and 4Gp-SAST
codes with 3 bits pcu is also compared with that of the rate-3/4 
QSTBC and MDC-QSTBC (using 16QAM). 4Gp-SAST code 
yields a 0.3 dB improvement over MDC-QSTBC and performs 
the same as QSTBC. Specifically, 4Gp-QSTBC using 8QAM- 
S performs much better than the QSTBC; it produces a 1.2 
dB gain over QSTBC with the same decoding complexity. 

Further simulations for 5 and 8 Tx antennas also confirm 
that 4Gp-QSTBC and 4Gp-SAST codes perform better than 
OSTBC, MDC-QSTBC, QSTBC, and SAST codes. Due to 
the lack of space, we omit the details. 

VI. Conclusions 

We have presented two new rate-one STBC with four- 
group decoding, called 4Gp-QSTBC and 4Gp-SAST codes. 
They offer the lowest decoding complexity compared with the 
existing rate-one STBC. Their closed-form PEP are derived, 
enabling the optimization of signal rotations. Compared with 
other existing low decoding complexity STBC (such as OS- 
TBC, MDC-QSTBC, CIOD, and QSTBC), our newly designed 
STBC have several additional advantages including higher 
code rate, better BER performance, lower encoding/decoding 






delay, and lower peak-to-average power ratio (PAPR) because 
zero-amplitude symbols are avoided in the code matrices. 
Recent results in [23] present a flexible design of multi-group 
STBC. However, the code rate is still limited by 1 symbol pcu. 
Thus, the systematic design of high-rate multi-group STBC is 
still an open research problem.